Abstract

Background: The Drosophila melanogaster Serpin 42 Da gene (previously Serpin 4) encodes a serine protease 
inhibitor that is capable of remarkable functional diversity through the alternative splicing of four different reactive 
centre loop exons. Eight protein isoforms of Serpin 42 Da have been identified to date, targeting the protease 
inhibitor to both different proteases and cellular locations. Biochemical and genetic studies suggest that Serpin 
42 Da inhibits target proteases through the classical serpin 'suicide' inhibition mechanism; however, the crystal
structure of a representative Serpin 42 Da isoform remains to be determined.

Results: We report two high-resolution crystal structures of Serpin 42 Da representing the A/B isoforms in the
cleaved conformation, belonging to two different space groups and diffracting to 1.7 Å and 1.8 Å. Structural analysis
reveals the archetypal serpin fold, with the major elements of secondary structure displaying significant homology 
to the vertebrate serpin, neuroserpin. Key residues known to have central roles in the serpin inhibitory mechanism 
are conserved in both the hinge and shutter regions of Serpin 42 Da. Furthermore, these structures identify 
important conserved interactions that appear to be of crucial importance in allowing the Serpin 42 Da fold to act 
as a versatile template for multiple reactive centre loops that have different sequences and protease specificities. 

Conclusions: In combination with previous biochemical and genetic studies, these structures confirm for the first 
time that the Serpin 42 Da isoforms are typical inhibitory serpin family members with the conserved serpin fold and 
inhibitory mechanism. Additionally, these data reveal the remarkable structural plasticity of serpins, whereby the 
basic fold is harnessed as a template for inhibition of a large spectrum of proteases by reactive centre loop exon 
'switching'. This is the first structure of a Drosophila serpin reported to date, and will provide a platform for future 
mutational studies in Drosophila to ascertain the functional role of each of the Serpin 42 Da isoforms. 
Background

Serpins are a large superfamily of protease inhibitors 
that were originally identified as serine protease inhibitors,
but now encompass proteins that inhibit cysteine
proteases or have non-inhibitory roles [1,2]. The serpin 
superfamily is represented in all branches of life with 
over 1500 serpins identified to date [1-3]. As such, ser- 
pins have a remarkably wide array of functions that in- 
clude roles in immune defence, the blood coagulation 
pathway, and in hormone regulation and transport [1-3]. 
Within Drosophila there are over 20 inhibitory serpins, 
many of which modulate the innate immune response
[4]. Of these Drosophila serpins, eight are encoded by the
single Serpin 42 Da (Spn42Da) gene, with each isoform
formed through alternative splicing of different reactive
centre loop (RCL) exons or signal peptides [4].

Serpin structures are typified by a meta-stable native 
state, with a solvent exposed RCL that serves as 'bait' to 
bind and inhibit the target protease [1]. Specific recognition
of the RCL by the target protease is primarily defined
by the sequence of the RCL from the P15 to P3'
positions, although studies have also shown a role for other
exosites in determining protease-inhibitor recognition
[1-3]. The peptide bond between the P1 and P1' residues
is severed upon proteolytic attack by the target
protease. Subsequently, the metastable serpin native 
state undergoes a large conformational change, translocating
the protease to the other pole of the serpin. The
serpin-protease complex remains covalently bound through
an ester bond between the catalytic residue of
the protease and the main chain carbonyl of the P1 residue.
Thus, the protease is inhibited at the acyl-enzyme
intermediate stage of the enzymatic cleavage reaction
[2,3]. The resultant serpin-protease complex is highly
stable and effectively inactivates both the protease and the
serpin, leading to the description of serpins as 'suicide'
inhibitors [3]. Within the final inhibitory complex, the
serpin is in a hyper-stable conformation with the 'hinge'
region of the RCL forming the top of the central 4th strand
of β-sheet A. This conformation can also occur spontaneously
upon cleavage of the RCL, forming the stable
'cleaved' serpin conformation [1].

The Drosophila melanogaster Spn42Da gene, previously 
known as Serpin 4, encodes eight different protein isoforms.
Spn42Da isoforms B, D, E/F and I have N-terminal
signal peptides and contain different RCL sequences due 
to the alternative splicing of four RCL encoding exons 
within the Spn42Da gene (Figure 1). The remaining four 
isoforms (A, G, H/K/L and J) have the same RCL splicing 
pattern, but do not have identifiable signal peptides and 
are thought to function within the cytosol [5,6]. A similar
RCL splicing pattern, which generates multiple serpin isoforms
from single genes, has been identified in nematodes
and urochordates [5,7-9]. Spn42Da-A was the first isoform




Figure 1 Spn42Da encodes eight protein isoforms. Two invariable 
core exons are followed by four alternatively spliced exons (I-IV)
encoding different reactive centre loops (RCLs). For each of these, 
there is an alternative N-terminal exon encoding a secretion signal 
peptide (SP). Isoform names are according to FlyBase annotation 
(release 5.55) [6]. 




to be characterised in Drosophila, with effective inhibition 
of proprotein convertases (PC), including human furin and 
Drosophila PC2, amontillado. Upon inhibition, Spn42Da-A
forms an SDS-stable complex with a stoichiometry of
inhibition characteristic of other serpins [10-12]. Further
biochemical studies have identified a diverse range of pu- 
tative target protease families for the different Spn42Da 
isoforms, including serine proteases of the subtilase family, 
papain-like cysteine proteases, and members of the chy- 
motrypsin family [13]. Therefore, through alternative spli- 
cing of the RCL, the Spn42Da gene is able to produce a 
wide range of intracellular and extracellular protease in- 
hibitors that are targeted towards a remarkably diverse 
range of protease families. This has led to the hypothesis 
that in addition to a role in inhibition of PCs, Spn42Da 
isoforms may be essential for immune defence by inhibit- 
ing a large spectrum of pathogenic proteolytic enzymes 
[13]. However, further work is still required to characterise
the potentially diverse range of functions of the eight
Spn42Da isoforms within Drosophila.

Although evidence suggests that the Spn42Da isoforms
function as bona fide serpins and can form a covalent
complex with target proteases, there are no reported
structures of any Spn42Da isoform. Furthermore, it is
currently unclear how the structure of the Spn42Da core 
can act as a versatile template to accommodate the 
switching of various RCL sequences whilst maintaining 
its function. To gain insight into these questions,
we have expressed and crystallised Spn42Da bearing the
RCL from isoforms A and B, and determined its structure
at high resolution. The Spn42Da-A/B structure (referred to
herein as Spn42Da) is in the cleaved conformation with a
high degree of structural homology to the vertebrate serpin,
neuroserpin.
The structures illustrate the plasticity of the Spn42Da 
fold, and begin to describe how this fold is able to ac- 
commodate a wide variety of sequences through the use 
of alternatively spliced RCLs. 

Results and discussion 

Crystallisation and data quality 

Crystals of Spn42Da grew after 6 months to 1 year at 20°C
in two crystal forms, each with a single Spn42Da molecule
in the asymmetric unit. The first crystal form, designated
Spn42Da-1, crystallised in the C222₁ space group (a =
59.74 Å, b = 125.95 Å, c = 119.94 Å) and diffracted to a
resolution of 1.7 Å. The second crystal form, designated
Spn42Da-2, crystallised in the I222 space group (a = 87.22 Å,
b = 109.77 Å, c = 140.58 Å) and diffracted to a resolution
of 1.8 Å. The structure of Spn42Da-1 was determined by
molecular replacement with cleaved neuroserpin as the
search model (PDB code 3F02) [14]. The Spn42Da-2
structure was solved using the Spn42Da-1 structure as the
search model for molecular replacement.
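
Both space groups reported here are orthorhombic, so the unit-cell volume is simply the product of the three cell edges. As a small worked example, the sketch below computes that volume for the two crystal forms from the cell parameters quoted above. The function name is an illustrative choice and is not taken from any crystallographic package.

```python
def orthorhombic_cell_volume(a, b, c):
    """Volume (cubic angstroms) of a cell whose three angles are all 90 degrees."""
    return a * b * c


# Cell edges in angstroms, as reported for the two crystal forms.
cells = {
    "Spn42Da-1 (C2221)": (59.74, 125.95, 119.94),
    "Spn42Da-2 (I222)": (87.22, 109.77, 140.58),
}

volumes = {name: orthorhombic_cell_volume(*edges) for name, edges in cells.items()}

for name, volume in volumes.items():
    print(f"{name}: {volume:.0f} cubic angstroms")
```

The I222 cell is roughly 1.5 times the volume of the C222₁ cell, although each form contains a single molecule in the asymmetric unit.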
The structures are of high quality; the Spn42Da-1
structure refined to an Rwork/Rfree of 16.51% and 18.81%,
respectively, and the Spn42Da-2 structure to an Rwork/
Rfree of 16.58% and 18.54%, respectively. Within the
Ramachandran plot, the Spn42Da-1 structure contains
98.9% of residues in the favoured region and no residues
in the outlier regions. The Spn42Da-1 structure scored
in the 100th percentile of structures in MolProbity, with
a clashscore of 2.39 for all atoms and an overall MolProbity
score of 1.02 [15]. The Spn42Da-2 structure also
has excellent geometry, with 98.6% of amino acids in the
favoured region of the Ramachandran plot and no outliers.
The Spn42Da-2 structure scored in the 100th percentile
in MolProbity with a clashscore of 1.02 and an
overall MolProbity score of 0.80 [15]. The complete data
collection statistics for the two crystal forms are reported
in Table 1.

The Spn42Da-1 model contains 369 amino acids spanning
residues 4 to 387, with three loops (D86-Q88,
D191-R194, K367) and seven residues of the RCL
(A343-E349) missing due to poor electron density,
most likely reflecting proteolytic cleavage or mobility
of these regions. The Spn42Da-2 model contains 371
residues, spanning residues 4 to 381 of the Spn42Da protein.
Clear electron density is present for all loops within the
structure except for seven residues of the RCL (R342-E348).
The last four residues of the Spn42Da protein,
corresponding to a likely ER retention signal, are not
present in the density of either structure.

The cleaved Spn42Da crystal structure 

Spn42Da has a typical serpin fold comprising a mixed α-β
secondary structure with an N-terminal helical region and
a C-terminal β-barrel fold [2] (Figure 2A and B). The
major elements of secondary structure characteristic of
serpins are present, with a total of 3 β-sheets and 9 α-helices.
The two crystal forms of Spn42Da are structurally
homologous with no large conformational differences and
an RMSD of 0.35 Å across 363 aligned residues (Figure 2C).
Except for the presence of an extra six C-terminal
amino acids within the Spn42Da-1 structure,
there are no major regions of difference between the
two crystal forms. As Spn42Da-2 has the more complete
electron density for internal loops of the two
structures, it has been used for figures and comparative
analysis unless otherwise stated.
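
The RMSD values quoted in this section are root-mean-square distances between equivalent atoms after superposition. As a minimal sketch (not the software used in this study), the function below computes that quantity for two lists of paired coordinates that are assumed to be already superposed. The function name and the toy coordinates are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Assumes both lists are in the same atom order and already superposed;
    no fitting (for example a Kabsch rotation) is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three hypothetical C-alpha positions in angstroms.
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]

print(f"RMSD = {rmsd(model_1, model_2):.3f} angstroms")
```

In practice the number of aligned residues matters as much as the RMSD itself, which is why both are reported together in the text.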

The Spn42Da RCL sequence from P4 to P1 (R-R-K-R)
corresponds to a classic furin-like consensus recognition
sequence, with experimental evidence suggesting that
cleavage occurs after the P1 R342 [10,12]. In the Spn42Da-1
crystal structure, clear electron density ends after the P1
Arg residue, suggesting that the RCL was cleaved after this



Table 1 Data collection and refinement statistics (highest resolution shell in brackets)

                                    Spn42Da-1                         Spn42Da-2
Data collection statistics
  Beamline                          Australian Synchrotron MX2        Australian Synchrotron MX2
  Oscillation range (°)             1                                 1
  Space group                       C222₁                             I222
  Cell parameters a, b, c (Å)       59.74, 125.95, 119.94             87.22, 109.77, 140.58
  Cell parameters α, β, γ (°)       90, 90, 90                        90, 90, 90
  Resolution range (Å)              27.88 - 1.7 (1.79 - 1.7)          40.0 - 1.80 (1.9 - 1.8)
  Observed reflections              378728                            398398
  Unique reflections                50081 (7222)                      62423 (8962)
  Completeness (%)                  99.9 (100.00)                     99.6 (98.9)
  Rmerge                            0.100 (0.672)                     0.128 (0.935)
  I/σ(I)                            12.7 (3.1)                        7.3 (1.5)
Refinement statistics
  Resolution range (Å)              27.88 - 1.7 (1.761 - 1.7)         37.06 - 1.80 (1.864 - 1.8)
  No. of protein atoms              2927                              2946
  No. of water atoms                344                               377
  Rwork/Rfree                       0.1651/0.1881 (0.2341/0.2497)     0.1658/0.1854 (0.2774/0.3012)
  Rmsd from ideal bond length (Å)   0.006                             0.008
  Rmsd from ideal bond angles (°)   1.03                              1.09
Ramachandran plot (%)
  Favoured region                   99                                99
  Outlier region                    0                                 0

consensus recognition site between the P1 Arg and the P1'
Ala residues. In the Spn42Da-2 structure, by contrast, clear
electron density is only present up to the P2 Lys residue.
Spn42Da is in the highly stable 'cleaved' conformation,
with the cleaved RCL inserted into β-sheet A to form the
4th strand in the six-stranded central β-sheet (Figure 3A)
[1]. No density corresponding to residues P1' to P7' is
present within the crystal structures, thus confirming they
represent the cleaved conformation and not the latent
conformation, in which the intact RCL is inserted into β-sheet
A [3]. Spn42Da was purified in the native conformation,
with cleavage of the RCL most likely occurring during the
extended crystallisation time. It is common for the RCL to
be cleaved within the crystallisation solution over time,
thereby allowing the protein to readily crystallise in the
stable 'cleaved' conformation. This is the first reported
crystal structure of a Drosophila serpin, and confirms that
the Spn42Da isoforms have a protein fold typical of
serpins. Furthermore, the ability of the RCL to insert into
β-sheet A confirms the capability of the Spn42Da protein
to transit from the metastable native state to a stable
RCL-inserted conformation typical of inhibitory serpins.
Combined with the biochemical and genetic evidence provided
by previous studies [10-12], these structures confirm that
the Spn42Da isoforms indeed act as typical serpins, capable
of inhibiting diverse families of proteases by the classic
serpin 'suicide' inhibition mechanism.
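
The P and P' labels used above follow the standard protease-substrate nomenclature, counting outwards from the scissile bond. As an illustrative sketch, the helper below converts a residue number into its RCL position label, taking the P1 residue to be R342 as stated in the text. The function name is hypothetical.

```python
def rcl_position(residue_number, p1_residue=342):
    """Return the P/P' label of an RCL residue relative to the scissile bond.

    Residues up to and including P1 are labelled P1, P2, ... moving towards
    the N-terminus; residues after the scissile bond are P1', P2', ...
    The default P1 residue (342) is the value given in the text for Spn42Da.
    """
    if residue_number <= p1_residue:
        return "P{}".format(p1_residue - residue_number + 1)
    return "P{}'".format(residue_number - p1_residue)


for residue in (339, 341, 342, 343, 349):
    print(residue, rcl_position(residue))
```

With this numbering the R-R-K-R recognition sequence spans residues 339 to 342 (P4 to P1), and the unmodelled A343-E349 stretch corresponds to P1' to P7'.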

The serpin inhibitory mechanism is conserved in Spn42Da 

Serpin structures from other insects, including serpin 1K
from Manduca sexta and SPN48 from Tenebrio molitor, 
have been solved in their native conformation [16,17]. 
These structures have the typical fold of other native- 
state inhibitory serpins, whereby the RCL extends into 
the solvent to act as bait for the target protease. Of the 
vertebrate serpins, Spn42Da is most closely related to 




Figure 3 Conserved homology between Spn42Da and neuroserpin. A, Cartoon representation of cleaved Spn42Da highlighting the β-sheet
A in red and the inserted RCL in blue with the strands numbered. B, Crystal structure of native neuroserpin (PDB code 3FGQ [18]) coloured as per 
(A). C, Crystal structure of cleaved neuroserpin (PDB code 3F02 [14]) coloured as per (A). D, Cartoon representation of cleaved neuroserpin (grey; 
PDB code 3F02 [14]) superimposed with cleaved Spn42Da (blue). E, Ribbon representation of native neuroserpin (green; PDB code 3FGQ [18]) 
superimposed with cleaved Spn42Da (blue). 
mammalian neuroserpin, with 34% sequence similarity and
shared functional characteristics [11]. Indeed, the sequence
similarity between Spn42Da and neuroserpin is very close
to that shared between Spn42Da and the other insect serpins
solved to date (~35%). Here, we have exploited the
availability of highly characterised structures of both the
native and cleaved conformations of neuroserpin, together
with its similar functionality and high sequence
similarity to Spn42Da, to gain insight
into the conserved inhibitory mechanism of Spn42Da by
directly comparing our Spn42Da structure with the two
(native and cleaved) neuroserpin structures [14,18].

Previous studies have shown that the RCL of native
neuroserpin extends into the solvent with partial α-helical
structure (PDB 3FGQ) [18] (Figure 3B). Upon
cleavage, the RCL inserts into the molecule forming a
single strand (S4A) of the central β-sheet A (PDB 3F02)
(Figure 3C) [14]. This conformational change between
native and cleaved neuroserpin is representative of the
structural transition of the majority of other characterised
inhibitory serpins [1,19]. However, differences in
the initial RCL position are apparent in some serpins,
including mammalian antithrombin and insect SPN48,
where the RCL hinge is partially inserted into the
'breach' region of β-sheet A, forming a short β-strand 4A
[17,20]. Attempts are ongoing to crystallise Spn42Da in
its native conformation in order to determine the precise
conformation of the RCL (extended or partially inserted)
and the implications of this orientation for Spn42Da
activity.

Cleaved Spn42Da and cleaved neuroserpin (PDB 3F02)
are highly homologous with an RMSD of 1.25 Å across
343 aligned residues [14] (Figure 3D). The major elements
of secondary structure align: both Spn42Da and neuroserpin
are in the typical cleaved conformation with the
inserted RCL forming one (S4A) of the 6 strands within
the central β-sheet A (Figure 3D). The major difference
between the two structures occurs in the position and
length of the connecting loop regions. Spn42Da and native
neuroserpin superpose with an RMSD of 2.2 Å across 327
aligned residues (Figure 3E). The β-sheet A is smaller by a
single strand in native neuroserpin, which gives the molecule
a more compact fold. The Cα positions of Spn42Da
helices B, D, E, and F and β-sheet A strands 1-3 undergo
the greatest displacement upon cleavage and subsequent
insertion of the RCL into β-sheet A (Figure 3E). This
secondary structural movement is consistent with models of
RCL cleavage and insertion for neuroserpin and other
serpins [3,18].

The shutter region is a conserved cluster of amino
acids that provides key interactions for controlling the
opening of the central β-sheet A and insertion of the
RCL [18,21]. This region is therefore of critical importance
for the serpin inhibitory mechanism and protein
stability. Sequence alignment of Spn42Da and neuroserpin
identified conserved residues in the shutter region
between the two proteins, suggesting a uniform inhibitory
mechanism (Figure 4A). Specifically, five residues
that have previously been identified to play a key role in
the hydrogen bonding network of the shutter region are
highly conserved between neuroserpin and Spn42Da,
allowing us to analyse the likely changes that occur
in Spn42Da upon protease inhibition and RCL insertion.
Indeed, these residues are highly conserved across the
entire serpin superfamily, with sequence alignment of
over 200 serpins highlighting that S36, S39, N166, and
H317 (Spn42Da numbering) are the most common residues
to be found at these positions within the shutter region [21].
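
The conservation described here can be checked directly on the aligned segments shown in Figure 4A. The sketch below counts identical positions in two of those segments and looks up the shutter residues by number. The sequences are transcribed from the figure, and the start numbers (20 and 316) are assumed to give the Spn42Da number of the first residue in each segment; the function names are hypothetical.

```python
def count_identities(seq_a, seq_b):
    """Count positions where two equal-length aligned segments are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned segments must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


def residue_at(segment, first_residue, residue_number):
    """Return the residue with the given number, given the segment's first number."""
    return segment[residue_number - first_residue]


# (first Spn42Da residue number, Spn42Da segment, neuroserpin segment),
# transcribed from the Figure 4A alignment.
segments = [
    (20, "VYGKLSGQKPGENIVFSPFSI", "MYNRLRATGEDENILFSPLSI"),
    (316, "IHKAFIEVNEEGTEAAAATGM", "IHKSFLEVNEEGSEAAAVSGM"),
]

for first, spn42da, neuroserpin in segments:
    same = count_identities(spn42da, neuroserpin)
    print(f"segment starting at residue {first}: {same}/{len(spn42da)} identical")

# Shutter residues named in the text, looked up by Spn42Da number.
print(residue_at(segments[0][1], 20, 36), residue_at(segments[0][1], 20, 39))
print(residue_at(segments[1][1], 316, 317))
```

Under these assumptions the lookups return serine at positions 36 and 39 and histidine at position 317, in agreement with the residues named in the text.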

In native neuroserpin, the shutter region is composed of
a hydrogen bonding network between the central H338
and S340 on strand S5A, residues S49 and S52 at the top
of helix B, and N182 on strand S3A [18] (Figure 4B). In
comparison, the shutter region of cleaved Spn42Da between
S5A and S3A has opened to accommodate the RCL,
which forms strand S4A of β-sheet A (Figure 4C). Residues
N166 of strand S3A and S39 on helix B are heavily displaced
when T334 of the RCL inserts into β-sheet A as it
opens to form the cleaved conformation. Despite this
major change in secondary structure, a hydrogen bonding
network within the shutter region is retained in the
cleaved conformation. RCL residue T334 likely forms multiple
hydrogen bonds with surrounding key conserved
amino acids including N166, S39 and H317. Therefore,
the residues within the shutter region that are required to
accommodate the large conformational changes that occur
in the transition from the native state to the hyperstable
cleaved conformation are conserved in Spn42Da. These
data suggest that the classic serpin inhibitory mechanism
is conserved in the Spn42Da isoforms.

Structural basis for gene splicing and RCL switching 

The most remarkable aspect of the Spn42Da gene is the 
capacity of the RCL exons to be variably spliced, with the 
eight protein isoforms exhibiting differential cellular local- 
isation and protease targets [5,10-13]. Sequence alignment 
of the Spn42Da isoforms and neuroserpin reveals se- 
quence conservation in the critical hinge region and in 
two clusters of the variably spliced RCL (Figure 5A). Each
of the isoforms displays clear variability across the consensus
recognition sequence upstream of the critical P1
position. This variability affects the activity of the isoforms
for different protease families. Previous studies identified 
Spn42Da-A/B as a potent inhibitor of furins, and the 
remaining variants as inhibitors of cathepsins, chymotryp- 
sin and elastases [10-13]. 
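
The link between RCL sequence and furin inhibition can be illustrated by scanning the variably spliced segments for a furin-like recognition motif. The sketch below uses the commonly cited minimal pattern R-X-K/R-R, which is an assumption introduced here for illustration and is not a definition taken from this study. The three sequences are transcribed from the first twenty residues of the variably spliced region in Figure 5A, with isoform labels assigned by row order, so they should be treated as approximate.

```python
import re

# Assumed minimal furin-like motif: Arg, any residue, Lys or Arg, Arg.
FURIN_LIKE = re.compile(r"R.[KR]R")

# RCL segments transcribed from Figure 5A (labels inferred from row order).
rcl_segments = {
    "A/B": "TGMAVRRKRAIMSPEEPIEF",
    "D/G": "TGMFMSLTSLPMPKPDPIRF",
    "E/H": "TGMVMCYASMLTFEPQPVQF",
}


def find_motif(sequence):
    """Return the first furin-like match in the sequence, or None if absent."""
    match = FURIN_LIKE.search(sequence)
    return match.group(0) if match else None


hits = {isoform: find_motif(seq) for isoform, seq in rcl_segments.items()}

for isoform, motif in hits.items():
    print(isoform, motif if motif else "no furin-like motif")
```

Of the segments tested, only the A/B segment contains the motif (R-R-K-R), consistent with Spn42Da-A/B being the furin-inhibiting variant.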

The cleaved Spn42Da structure provides our first struc- 
tural insight into how RCL switching is accommodated by 
A

           20
Spn42Da    VYGKLSGQKPGENIVFSPFSI
N-serpin   MYNRLRATGEDENILFSPLSI
           33

           162
Spn42Da    LVLVNAIHFKGTWQHQFAKHL
N-serpin   LALINAVYFKGNWKSQFRPEN
           178

           316
Spn42Da    IHKAFIEVNEEGTEAAAATGM
N-serpin   IHKSFLEVNEEGSEAAAVSGM
           337


Figure 4 Conserved role of the shutter region for RCL insertion into the A β-sheet. A, Sequence alignment of selected residues of the
shutter region between Spn42Da and neuroserpin. Highlighted residues and hydrogen bonding network of the shutter region in native
neuroserpin (PDB code 3FGQ [18]) (B) and cleaved Spn42Da (C). Spn42Da 2mFo-DFc maps displayed as blue mesh contoured at 2.0 σ and
putative hydrogen bonds displayed as red dashed lines.



the Spn42Da protein fold (Figure 5B). Spn42Da is
cleaved after the P2 K341, with residues R342-E348 of
the variably spliced RCL region not evident in the structure.
These missing residues are either mobile in the
crystal structure or absent as a result of further
proteolysis (Figure 5B). Clear electron density is present
for the highly conserved hinge region, comprised of relatively
small amino acids, with residues N324 to T334
forming the S5A-S4A loop and the top of S4A within
the central β-sheet A (Figure 5B). This hinge region is
highly conserved between Spn42Da and neuroserpin,
highlighting the recognised importance of this critical
flexible region in the serpin inhibitory mechanism [2,3].
Residues G335 to V338 of the variably spliced region
are accommodated into the bottom of strand S4A, with
the larger residues of the consensus recognition sequence
positioned into the solvent at the bottom of
strand S4A, making few important interactions.
Within the variably spliced region there are only two
clusters of amino acids that are highly conserved and
appear crucial to incorporating the variably spliced RCL
exons into the serpin scaffold to produce active protease
inhibitors. Residues H357, P358, and F359 are completely
conserved between the Spn42Da isoforms and
neuroserpin, and form essential interactions in the turn
leading into, and at the beginning of, strand S4B. Conserved
residues F372 and G374 also appear vital in maintaining
interactions at the interface of β-sheet B and β-sheet



A

Hinge
             324
Spn42Da A/B  NEEGTEAAAA
Spn42Da D/G  NEEGTEAAAA
Spn42Da E/H  NEEGTEAAAA
Spn42Da I/J  NEEGTEAAAA
N-serpin     NEEGSEAAAV

Variably Spliced RCL
Spn42Da A/B  TGMAVRRKRA IMSPEEPIEF FADHPFTYVL
Spn42Da D/G  TGMFMSLTSL PMPKPDPIRF NVDHPFTFYI
Spn42Da E/H  TGMVMCYASM LTFEPQPVQF HVQHPFNYYI
Spn42Da I/J  TVWRVMAVAA -FSR KHF IANHPFAFYV
N-serpin     SGMIAISRMA -VLYP QV IVDHPFFFLI

             364
Spn42Da A/B  VHQK-DLPLF WGSWRLEEN TFASSEHDEL
Spn42Da D/G  LNKD-STALF AGSIKKL
Spn42Da E/H  INKD-STILF AGRINKF
Spn42Da I/J  KTHY-DLPIF TGRYLG
N-serpin     RNRRTGTILF MGRVMHPETM NTSGHDFEEL

ER-Retention Signal


Figure 5 Structural basis for gene splicing of the Spn42Da RCL. A, Sequence alignment of the variably spliced RCLs and surrounding regions
from the eight Spn42Da variants. B, Spn42Da structure with the variably spliced region highlighted in blue and the hinge region in red. Selected
RCL residues are shown in stick format with the 2mFo-DFc maps displayed as blue mesh contoured at 1.5 σ.
A and the packing between strands of β-sheet B,
respectively.

These data highlight the remarkable ability of the serpin
protein fold to act as an accommodating scaffold for
variable RCL sequences. Strict conservation appears to be
required only in two clusters of the variably spliced RCL, in
order to maintain the structural integrity of the top β-sheet
B. After cleavage, the P1 residue and consensus
recognition sequence are positioned at the bottom of
strand S4A, where they make few critical interactions
and therefore show high sequence variability. As such,
the Spn42Da protein fold can accommodate the high se- 
quence variability across the Spn42Da isoforms, allowing 
for extensive versatility in targeting a range of protease 
families. 

Conclusions 

We have solved the first crystal structures of Spn42Da 
from Drosophila melanogaster in the cleaved conformation,
with two different crystal forms diffracting to 1.7
and 1.8 Å resolution. These data confirm for the first time
that Spn42Da is a bona fide serpin, with the typical protein 
fold of this family of protease inhibitors. The Spn42Da 
structure has a high degree of homology to the mamma- 
lian neural serpin, neuroserpin, albeit with minor differ- 
ences in the loop regions connecting the major elements 
of secondary structure. Structural comparison between 
Spn42Da and neuroserpin defines a likely conserved in- 
hibitory mechanism, with sequence conservation of crit- 
ical residues in both the hinge and shutter regions of 
Spn42Da. Importantly, the Spn42Da structure illustrates 
the structural features of the Spn42Da protein fold that 
are crucial in allowing it to act as a template that can be 
directed to inhibit diverse protease families through RCL 
switching. Furthermore, the Spn42Da structure provides
the basis for future mutational targeting of Spn42Da isoforms
to understand their various roles in Drosophila.

Methods 

Cloning, expression, purification 

PCR-amplified Spn42Da cDNA was cloned into a pET3a
vector (Novagen) to express an untagged protein corresponding
to residues 1-392. Spn42Da was expressed overnight in
Rosetta2 (DE3) pLysS cells (Novagen) by IPTG induction
at 16°C. The cells were lysed by sonication in
50 mM Tris-HCl (pH 8.0), 150 mM NaCl, 5 mM
β-mercaptoethanol, and a complete EDTA-free protease
inhibitor tablet (Roche). The lysate was clarified by
centrifugation, filtered through a 0.45 µm membrane, and
diluted at a 1:1 ratio in buffer containing 50 mM Tris-HCl
(pH 8.0) and 5 mM β-mercaptoethanol. The supernatant
was loaded onto a 5 ml HiTrap Q HP column
(GE Healthcare), and eluted with a gradient from
50 mM Tris-HCl (pH 8.0), 50 mM NaCl, and 5 mM
β-mercaptoethanol to 50 mM Tris-HCl (pH 8.0), 1 M
NaCl, and 5 mM β-mercaptoethanol. Fractions containing
Spn42Da were combined and dialysed against buffer
containing 50 mM Tris-HCl (pH 8.0), 20 mM NaCl,
and 5 mM β-mercaptoethanol. The HiTrap Q purification
stage was repeated two times to increase the purity
of Spn42Da. The resultant fractions were applied to a
Superdex 75 16/60 prep-grade column (GE Healthcare)
and eluted in 25 mM Tris-HCl (pH 8.0), 75 mM NaCl,
5 mM β-mercaptoethanol, and 0.02% (w/v) NaN3.

Crystallisation and structure determination

Spn42Da-1 crystals were grown at 20°C by hanging drop
vapour diffusion in 0.2 M lithium chloride, 20% (w/v)
PEG3350. Spn42Da-2 crystals were grown at 20°C by
hanging drop vapour diffusion in 0.2 M ammonium phosphate
monobasic, 20% (w/v) PEG3350. Spn42Da crystals
were flash cooled in liquid nitrogen in mother liquor containing
20% ethylene glycol. Crystallographic data were
collected at the Australian Synchrotron at the MX2 beamline
[22]. Data were processed and scaled using iMOSFLM
and programs within the CCP4 suite [23]. The Spn42Da-1
structure in the C222₁ space group was solved by obtaining
initial phases by molecular replacement using a cleaved
neuroserpin structure [14] (PDB code 3F02) as the search
model in PHENIX using the Phaser program [24]. The
structure was automatically built in ARP/wARP and iterative
cycles of refinement were carried out using PHENIX
Refine and REFMAC5 [24-26]. Local rebuilding was performed
in Coot [27], resulting in a model with an R-factor
of 16.51% (Rfree of 18.81%) and excellent geometry
(Table 1). The Spn42Da-2 structure in the I222 space group
was determined using the Spn42Da-1 structure as the molecular
replacement search model, with refinement and rebuilding
carried out as per the Spn42Da-1 structure. The
resultant model has an R-factor of 16.58% (Rfree of 18.54%)
with excellent geometry (Table 1). All figures were made
using PyMOL and coordinates have been deposited at the
Protein Data Bank with accession codes 4P0F and 4P0O
for the Spn42Da-1 and Spn42Da-2 structures respectively.